Word Sense Disambiguation Based on Structured Semantic Space
نویسندگان
چکیده
In this paper, we propose a framework, structured semantic space, as a foundation for word sense disarnbiguation tasks, and present a strategy to identify the correct sense of a word in some context based on the space. The semantic space is a set of multidimensional real-valued vectors, which formally describe the contexts of words. Instead of locating all word senses in the space, we only make use of mono-sense words to outline it. We design a merging procedure to establish the dendrogram structure of the space and give an heuristic algorithm to find the nodes (sense clusters) corresponding with sets of similar senses in the dendrogram. Given a word in a particular context' the context would activate some clusters in the dendrogram, based on its similarity with the contexts of the words in the clusters, then the correct sense of the word could be determined by comparing its definitions with those of the words in the clusters.
منابع مشابه
Latent Semantic Word Sense Induction and Disambiguation
In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. Th...
متن کاملDependency-Based Construction of Semantic Space Models
Traditionally, vector-based semantic space models use word co-occurrence counts from large corpora to represent lexical meaning. In this article we present a novel framework for constructing semantic spaces that takes syntactic relations into account. We introduce a formalization for this class of models, which allows linguistic knowledge to guide the construction process. We evaluate our frame...
متن کاملSemantic Relatedness for Biomedical Word Sense Disambiguation
This paper presents a graph-based method for all-word word sense disambiguation of biomedical texts using semantic relatedness as edge weight. Semantic relatedness is derived from a term-topic co-occurrence matrix. The sense inventory is generated by the MetaMap program. Word sense disambiguation is performed on a disambiguation graph via a vertex centrality measure. The proposed method achieve...
متن کاملOntology Based Query Expansion Using Word Sense Disambiguation
The existing information retrieval techniques do not consider the context of the keywords present in the user’s queries. Therefore, the search engines sometimes do not provide sufficient information to the users. New methods based on the semantics of user keywords must be developed to search in the vast web space without incurring loss of information. The semantic based information retrieval te...
متن کاملAutoExtend: Combining Word Embeddings with Semantic Resources
We present AutoExtend, a system that combines word embeddings with semantic resources by learning embeddings for non-word objects like synsets and entities and learning word embeddings which incorporate the semantic information from the resource. The method is based on encoding and decoding the word embeddings and is flexible in that it can take any word embeddings as input and does not need an...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1997